Syntactic SMT Using a Discriminative Text Generation Model
Abstract
We study a novel architecture for syntactic SMT. In contrast to the dominant approach in the literature, the system does not rely on translation rules, but treats translation as an unconstrained target sentence generation task, using soft features to capture lexical and syntactic correspondences between the source and target languages. Target syntax features and bilingual translation features are trained consistently in a single discriminative model. Experiments on the IWSLT 2010 dataset show that the system achieves a BLEU score comparable to state-of-the-art syntactic SMT systems.
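The idea of scoring freely generated target sentences with soft features in one discriminative model can be illustrated with a minimal sketch. This is not the system described in the abstract: the feature templates, the toy lexicon, the use of word bigrams as a stand-in for target syntax features, and the perceptron-style update are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
from collections import Counter


def extract_features(source, target, lexicon):
    """Return sparse soft features for one (source, target) candidate pair."""
    feats = Counter()
    # Bilingual lexical features: fire when a target word is listed as a
    # translation of a source word in the (hypothetical) toy lexicon.
    for s in source:
        for t in target:
            if t in lexicon.get(s, ()):
                feats["lex:%s=%s" % (s, t)] += 1
    # Target-side features: word bigrams stand in for target syntax features.
    padded = ["<s>"] + list(target) + ["</s>"]
    for a, b in zip(padded, padded[1:]):
        feats["bi:%s_%s" % (a, b)] += 1
    return feats


def score(weights, feats):
    """Linear model score: dot product of weights and feature counts."""
    return sum(weights.get(name, 0.0) * value for name, value in feats.items())


def best_candidate(source, candidates, lexicon, weights):
    """Pick the highest-scoring candidate; no translation rules constrain it."""
    return max(
        candidates,
        key=lambda c: score(weights, extract_features(source, c, lexicon)),
    )


def perceptron_update(weights, source, gold, candidates, lexicon, lr=1.0):
    """One perceptron-style update that trains both feature groups jointly."""
    predicted = best_candidate(source, candidates, lexicon, weights)
    if predicted != gold:
        for name, value in extract_features(source, gold, lexicon).items():
            weights[name] = weights.get(name, 0.0) + lr * value
        for name, value in extract_features(source, predicted, lexicon).items():
            weights[name] = weights.get(name, 0.0) - lr * value
    return predicted
```

In this sketch the lexical and target-side features share one weight vector, so a single update adjusts both at once. A real system would search a far larger space of target sentences than a fixed candidate list.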
Similar resources
A Feature-rich Supervised Word Alignment Model for Phrase-based Statistical Machine Translation
Word alignment plays an important role in statistical machine translation (SMT) systems. The output of word alignment can be used to build a phrase table, which is the core model in the decoding of new sentences. Most current SMT systems use GIZA++, a generative model, to automatically align words from sentence-aligned parallel corpora. GIZA++ works well when large sentence-aligned corpora are ...
Discriminative Reranking for Grammatical Error Correction with Statistical Machine Translation
Research on grammatical error correction has received considerable attention. For dealing with all types of errors, grammatical error correction methods that employ statistical machine translation (SMT) have been proposed in recent years. An SMT system generates candidates with scores for all candidates and selects the sentence with the highest score as the correction result. However, the 1-bes...
A Discriminative Latent Variable-Based "DE" Classifier for Chinese-English SMT
Syntactic reordering on the source-side is an effective way of handling word order differences. The 的 (DE) construction is a flexible and ubiquitous syntactic structure in Chinese which is a major source of error in translation quality. In this paper, we propose a new classifier model, a discriminative latent variable model (DPLVM), to classify the DE construction to improve the accuracy of the...
A Discriminative Lexicon Model for Complex Morphology
This paper describes successful applications of discriminative lexicon models to the statistical machine translation (SMT) systems into morphologically complex languages. We extend the previous work on discriminatively trained lexicon models to include more contextual information in making lexical selection decisions by building a single global log-linear model of translation selection. In offl...
Practical and Efficient Incorporation of Syntactic Features into Statistical Language Models
Automatic Speech Recognition (ASR) and Statistical Machine Translation (SMT), among other natural language processing applications, rely on a language model (LM) to provide a strong linguistic prior over word sequences of the often prohibitively large and complex hypothesis space of these systems. The language models deployed in most state-of-the-art ASR and SMT systems are n-gram models. Sever...